(54) Gestural indicators for selecting graphic objects.

(57) A graphical imaging system, wherein the rough location, size and shape of objects in the image is 
summarized by a first characteristic descriptor, representing a parametric "pose" computed for each 
object (32). Next, a second characteristic descriptor, i.e. a "gesture matching" function, is provided in
order to select the single object, or else the set of objects, that best comports with the user's selection 
gesture. When most closely matched, these key characteristic descriptors permit a simple and natural 
user gesture (30) to distinguish among a large set of graphic objects that may overlap spatially. User 
gestures can be simple slashes passing through the object, or quick, coarse approximations of objects' 
shapes. 

This invention pertains to a display editing system enabling users to edit images, especially images of text,
graphic diagrams, and freehand drawings. Typically, the image being edited is displayed to the user on a com- 
puter screen or other imaging device, and the user performs operations by typing strokes at a keyboard and/or 
by making gestures and pressing buttons using one or more pointing devices such as a mouse or stylus. 

Among the most important user operations are those enabling the user to select an object or objects in
the image to which further operations subsequently will be applied, e.g. move object, delete object, change
object size, etc. The object(s) selectable may include any combination of the following: single characters, single
words, lines or paragraphs of alphanumeric text such as found in an image of a page of text; graphic symbols,
arrowheads, and geometric figures including lines, circles, and rectangles such as found in graphic drawings;
and contiguous isolated strokes, stroke fragments bounded by corners, and stroke fragments bounded by
junctions such as found in freehand sketches. The issue of determining which visible items in an image may
be made available as selectable image objects is the subject of our copending application based on U.S. Serial
No. 08/101,647. In that case, a graphic editing application program maintains a set of selectable image objects
during the course of an image editing session. This invention relates to enabling the user from time to time to
select for further processing one or more of these objects as conveniently as possible.

An important problem faced by any user-interactive image editing tool is therefore the determination of
which object or objects in the image are intended by the user to be indicated by a given set of keystroke and/or
stylus gesture commands. Keystrokes are capable of specifying symbolic labels or identities unambiguously,
but they are an awkward and unnatural interface for the inherently spatial information contained in a displayed
image. Conversely, gestures made with a pointing device are convenient for specifying spatial information in
the form of locations on an imaging surface, but can lead to ambiguous specification of the object(s) the user
intends to select when several objects occupy the same region of the image. Stylus-based selection can be
done, for example only, by the following means: (1) having the user touch the stylus or press a mouse button
at a single point in the image whereafter the program selects the object or objects whose location parameter
lies at or nearest to that point; (2) having the user touch the stylus or press a mouse button at a point in the
image whereafter the program selects the object(s) whose spatial extent most nearly approaches the location
specified; (3) having the user draw a closed or nearly closed curve whereafter the program either selects all
objects sufficiently enclosed by, or else all objects whose location parameters are enclosed by, the closed
curve.

The first two methods fail to support user selection of objects in a very important situation, namely, when
the set of objects among which the user may wish to select shares overlapping support at the level of marks
in the image. Here, the user may wish to select just a vertical line, just a horizontal line, or an entire corner
shape. Specifying a single point on one of these lines does not provide sufficient information to distinguish
among the alternatives. Moreover, the third method (encircling intended objects) becomes unacceptable when
additional objects are found within the region enclosing the desired objects.

The object of the present invention is to provide a method and apparatus which will interpret users' simple, 
natural gestures in order to distinguish the object(s) in an image that the user intends to select. 

In accordance with one aspect, the invention conjoins two key concepts. First, the rough location, size,
and shape of objects in the image is summarized by a first characteristic descriptor, representing a parametric
"pose" computed for each object. Next, a second characteristic descriptor, i.e. a "gesture matching" function,
is provided in order to select the single object, or else the set of objects, that best comports with the user's
selection gesture. When most closely matched, these key characteristic descriptors permit simple and natural
user gestures to distinguish among a large set of graphic objects that may overlap spatially. User gestures
can be simple slashes passing through the object, or quick, coarse approximations of objects' shapes.

In accordance with a further aspect of the invention for selecting portions of objects, primitive curve or
line segments having no crossings, junctions or sharp corners are approximately linked end to end to define
paths. The paths are evaluated to estimate the similarity between them and a selection gesture curve, in order
to enable selection of the path that most closely approximates the selection gesture curve.

The invention is thus achieved by the novel union of user interface design technology with concepts from
the field of computer vision.

A system and method in accordance with the invention will now be described, by way of example, with 
reference to the accompanying drawings, in which:- 

Fig. 1 shows a display editing system in accordance with the invention.
Figs. 2a-e show a plurality of inputted image objects.
Fig. 3 shows a plurality of overlapping image objects.
Figs. 4a and b show a plurality of selectable objects.
Figs. 5a-f show the relationship between gesture and object.
Fig. 6 shows the factor used in analyzing an image object.
Fig. 7 is a flow chart illustrating the pose description vector of an object or gesture.
Fig. 8 schematically illustrates the selection function.
Fig. 9 is an overall flow chart of the selection routine.
Fig. 10 illustrates an aspect of selection according to the invention.

One embodiment of a display editing system which is the subject of the present invention is shown generally
in Fig. 1, wherein an input/display component 10 is illustrated, along with a data input device 12. The
input/display component is a combination display device as well as an input device. Thus input data entered
by the input device 12 is shown on the display as entered. The input device 12 is any form of pointer or pointing
device, such as a stylus, mouse, or even the operator's hand, sensed by appropriate proximity or other form
of sensing devices. Other forms of input, such as scanned hard copy, are also employable. Thus, data inputted
by the input device 12 to the screen 10 and displayed on the screen is entered by means of a digitizer 14 which
in turn drives a display driver 16 for creating images on the screen in real time as data is entered. The data
digitized by the digitizer 14 is conveyed to a CPU 18, wherein it is processed under control of a program
stored in the program memory 20, and the results stored in a data memory 22. The display screen 10 can be 

is employed to perform editing or drafting functions as desired by means of a menu selection 24, through the 
input device 12, and which is displayed directly on the display screen 10, along area 24. 

The foregoing digitizer input and display technique is representative of any of a plurality of devices for the
input and display of image objects, examples of which are readily apparent in the prior art.

Fig. 8 illustrates generally the procedures employed in accordance with the invention to select an image
object, i.e. the selecting of an object or objects in an image that are best indicated by a given user-input
selection gesture. To accomplish this, an overall gesture matching function, shown in Fig. 8, chooses among the
outputs of a cadre of independent subfunctions. Each subfunction provides for a characteristic style or format
of user selection gestures. For example, one subfunction may provide for the selection of a set of objects by
encircling them, while another may let the user select a single object by drawing a slash through it.

Each subfunction performs object-gesture matching under its own set of matching criteria, and it computes
two items. The first item is the single object, or set of objects, that best matches the selection gesture, under
that subfunction's criteria. The second item is a subfunction-confidence-score that estimates the "confidence"
or "likelihood" of this subfunction as the gesture selection format intended by the user. The overall gesture
matching function compares subfunction-confidence-scores across gesture matching subfunctions, and
returns the object(s) selected by the highest scoring subfunction (as long as this falls above a threshold value).
As shown in Fig. 8, user selection gesture 96 and precompiled or currently constructed selectable graphical
objects 98 are each processed through a series of selection subfunction operations 102. The output of each
selection subfunction is a series of selected objects 104 and the respective confidence score 106 representing
each of those objects. In a two step process, described further below, the series of object scores are then
gated in a final analysis 108 to produce the best match of an object for a gesture.
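
By way of illustration only, the arbitration among subfunctions described above might be sketched as follows. The function name overall_gesture_match, the Subfunction type and the threshold value of 0.5 are assumptions made for this sketch; the description states only that a threshold exists and that each subfunction yields its selected object(s) together with a confidence score.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a selection subfunction maps (gesture, objects) to
# (selected objects, confidence score in [0, 1]).
Subfunction = Callable[[object, Sequence[object]], Tuple[List[object], float]]


def overall_gesture_match(gesture, objects, subfunctions, threshold=0.5):
    """Return the objects chosen by the highest-scoring subfunction.

    `threshold` is an assumed value. An empty selection is returned when
    the best confidence score does not fall above the threshold.
    """
    best_objects: List[object] = []
    best_score = 0.0
    for subfunction in subfunctions:
        selected, score = subfunction(gesture, objects)
        if score > best_score:
            best_objects, best_score = list(selected), score
    if best_score <= threshold:
        return [], best_score
    return best_objects, best_score
```

Because every subfunction is called through the same interface, a further selection technique can be added by appending another callable to the list, without changing the arbitration step.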

The architecture of this overall gesture matching function permits the present invention to incorporate an
extensible set of object selection techniques. The user simply executes a selection gesture and the system
infers user intent by comparing subfunction-confidence-scores across the available object selection subfunctions.
In order that subfunction-confidence-scores may be compared across subfunctions in a meaningful way,
in their design they must be calibrated to a common standard. Under the preferred standard, each score ranges
between 0 and 1, where 0 indicates that the subfunction has no confidence that the object(s) it returns corresponds
to the user's intent, while 1 indicates absolute certainty. For example, an extremely simple measure of confidence
score for an "encircling" gesture may be expressed as one minus the weighted ratio of the distance between
the curve's endpoints and the curve's length.
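
By way of illustration only, the simple "encircling" confidence measure just mentioned might be computed as in the sketch below. The weight of 1.0, the clamping of the result to the range 0 to 1, and the representation of the gesture as a list of (x, y) points are assumptions of this sketch and are not specified in the description.

```python
import math


def curve_length(points):
    """Total length of the polyline through `points`."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def encircling_confidence(points, weight=1.0):
    """One minus the weighted ratio of endpoint gap to curve length.

    `weight` is an assumed tuning value. The result is clamped to [0, 1]
    so that it can be compared with the scores of other subfunctions.
    """
    length = curve_length(points)
    if length == 0.0:
        return 0.0  # a single point cannot encircle anything
    gap = math.dist(points[0], points[-1])
    return max(0.0, min(1.0, 1.0 - weight * gap / length))
```

A closed curve has no gap between its endpoints and scores 1, whereas a straight stroke, whose gap equals its length, scores 0 under the assumed weight.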

Two selection subfunctions are disclosed herein that may be used as selection subfunctions 102 to produce
selected objects 104 in the above architecture. It is apparent, however, that the invention is not limited
to these two embodiments. First, a "pose-matching" routine results in a selection of a single object that is most
similar to (least different from) the location, orientation, size, and shape of the user selection gesture. This
routine is especially useful for singling out objects found in close or overlapping proximity to many other spurious
objects. Second, a "path-matching" routine selects and computes a set of objects in the image that best
corresponds to the path of a selection gesture. This routine is especially useful for quickly selecting several
objects lying along a curvilinear path. It is of course apparent that other techniques may alternatively be employed.

POSE MATCHING SELECTION

Figs. 2a-e illustrate one aspect of the invention, wherein Fig. 2a is a freehand example of a line drawing
graphic image or a structural graphic image. Figs. 2b-e each show objects, as indicated by the solid lines, that
a user might wish to select from the image.

Referring to Fig. 3, a plurality of overlapping image objects drawn on a display screen is illustrated. Object
selection by conventional spatial location fails when the objects lie in roughly the same location on the imaging
surface, as do the rectangle, circle and triangle in this figure.

Referring to Figs. 4a and 4b, selection by pointing to a point on an object fails when objects share support.
Thus, in Fig. 4a, three selectable objects are constructed from two line segments. Conventional selection of
the horizontal segment does not distinguish between selection of only the horizontal segment versus selection
of the V. In Fig. 4b, conventional encircling gestures fail when spurious objects fall within the region enclosed
by the target selected objects.
With respect to Figs. 5a-f, selection of one of a plurality of overlapping image objects, in accordance with
the invention, is made by appropriately coupling a gesture to an object. Thus, in Fig. 5a, the object selected
by gesture 30 is the rectangular object 32. In Fig. 5b, circle gesture 34 selects the circular object 36. In Fig.
5c, the line gesture 38 selects the triangular object 40. In Fig. 5d, gesture 41 selects the short line segment
42. In Fig. 5e, gesture 44 selects the long line segment 46 passing through the box 48. In Fig. 5f, the circle
gesture 50 selects the box 52. Note that in Fig. 5c, the gesture 38 need not even closely resemble the object
selected, but is just a shorthand indication of location, orientation and size of the target object. Similarly, a
horizontal gesture could be used to select the rectangle in Fig. 5a.

For a typical graphic object, and for the shape of a selection gesture made with a pointing device, referring
to Fig. 6, an effective parametric description of the pose, that is, the rough location, orientation, size, and
shape of the object or gesture 60, consists of the following five parameters: the x-location (x) on the abscissa 62,
y-location (y) on the ordinate 64, orientation 66, scale factors, such as length 68 and/or width 69, and aspect
ratio (r) parameters of an oriented bounding box 70 associated with the object (with respect to a fixed global
coordinate system).

The x-y location may be computed as the centroid of the ON pixels of the object. The orientation parameter
may be taken as the orientation of the principal axis of inertia through the centroid. The scale parameter may
be computed as a linear function of the logarithm of the length of the object (linear extent along the principal
axis of inertia). The aspect ratio may be taken as the maximum of a fixed minimum value and the ratio of the
object's width (linear extent along the axis perpendicular to the principal axis of inertia) and length. The pose
description vector is extensible and may be imbued with more parameters than the five named above in order
to provide more distinguishing information. As shown in Fig. 7, the data input 60 is employed to derive x-y
location 82, orientation 84, scale 86, aspect ratio 88, and any additional parameters 90 to define a pose description
vector 92 which is a characteristic description representative of an object or gesture. For example, additional
parameters could include higher order geometric moments, angular closure, and contour texture measures.
The pose description vector defines, for each image object, a first characteristic descriptor, stored in
location 94. It also defines, for each user selection gesture, a second characteristic descriptor, stored in a location
96. A comparison and analysis function 98 is then performed between the defined graphical objects and the
user selection gesture to determine a confidence level for determining that object which best meets the user
selection gesture. The resultant selected object 100 is then defined. The confidence level and selection process
is described in more detail in connection with Fig. 8.
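
By way of illustration only, the five pose parameters described above might be derived from a set of ON pixel coordinates as in the following sketch. The names Pose and compute_pose, the minimum aspect ratio of 0.05, the gain and offset of the linear function of log length, and the one-pixel lower bound on length are all assumptions of this sketch; the description leaves these values open.

```python
import math
from typing import NamedTuple


class Pose(NamedTuple):
    x: float       # centroid x-location
    y: float       # centroid y-location
    theta: float   # orientation of the principal axis, in radians
    scale: float   # linear function of log(length)
    aspect: float  # max(min_aspect, width / length)


def compute_pose(pixels, min_aspect=0.05, scale_gain=1.0, scale_offset=0.0):
    """Summarize a non-empty list of (x, y) ON pixels as a pose vector."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    n = len(pixels)
    cx = sum(px for px, _ in pixels) / n
    cy = sum(py for _, py in pixels) / n
    # Central second moments give the principal axis through the centroid.
    mu20 = sum((px - cx) ** 2 for px, _ in pixels)
    mu02 = sum((py - cy) ** 2 for _, py in pixels)
    mu11 = sum((px - cx) * (py - cy) for px, py in pixels)
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    ux, uy = math.cos(theta), math.sin(theta)
    # Linear extents along and across the principal axis.
    along = [(px - cx) * ux + (py - cy) * uy for px, py in pixels]
    across = [-(px - cx) * uy + (py - cy) * ux for px, py in pixels]
    length = max(max(along) - min(along), 1.0)  # assumed one-pixel floor
    width = max(across) - min(across)
    scale = scale_gain * math.log(length) + scale_offset
    aspect = max(min_aspect, width / length)
    return Pose(cx, cy, theta, scale, aspect)
```

The same routine can be applied to the sampled points of a selection gesture, so that objects and gestures are described by vectors of the same form.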


POSE-MATCHING ROUTINE 

The pose-matching routine (Fig. 9) for assessing the similarity/difference of two poses must take account
of the relative significance of the location, orientation, scale, and aspect ratio parameters jointly in a highly
nonlinear fashion; a simple Euclidean distance measure is inappropriate. For example, the difference in the
orientation parameters of poses is significant when the aspect ratio of both objects is relatively high (near 0)
but becomes insignificant when either object displays low aspect ratio (near 1). For the purposes of the present
implementation of the invention, the following is the formula for the pose difference D between two poses,


P_1 = {x_1, y_1, θ_1, s_1, r_1} and P_2 = {x_2, y_2, θ_2, s_2, r_2}:

D = 1 - (1 - f lPl )(l - hp x )(\ - / P )(l - - /,) 


where

The constant parameters are assigned the following values:

p_1 = .75; p_2 = .2; p_3 = .693.

The difference formula is incorporated in the pose-matching routine by prescreening candidate objects
on the basis of spatial location using a spatially- and scale-indexed data structure which is described in the
aforementioned copending application based on U.S. S.N. 08/101,646. The pose difference between the
selection gesture and each of these preselected objects is computed in turn, and the object with the least
difference is returned. Since the pose match distance D above ranges from 0 (perfect match) to 1 (no match),
the subfunction-confidence-score may be taken simply as 1 - D.

The overall matching operation as described in Fig. 9, first shows an input gesture applied to the system,
block 120. Next, the system creates pose parameters for the input gesture, block 122. Then, from a collection
of objects stored in a data structure, block 124, candidate objects are selected based on proximity to the input
gesture, block 126. Next, the system does a lookup for pose parameters for candidate objects, block 128. The
system then compares the sets of pose parameters for the input gesture with those of candidate objects, block
130, and chooses the most similar object.
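
By way of illustration only, the flow of Fig. 9 (prescreen by proximity, compare poses, return the least-different object with a confidence of 1 - D) might be sketched as follows. It should be stressed that placeholder_difference is NOT the pose difference formula of this description: it is a hypothetical stand-in that merely yields a value between 0 and 1, combines four per-parameter terms as one minus a product, and discounts orientation when either aspect ratio is near 1. The radius-based prescreen likewise stands in for the spatially- and scale-indexed data structure, and the names pose_match and radius are assumptions.

```python
import math
from typing import Tuple

Pose = Tuple[float, float, float, float, float]  # (x, y, theta, scale, aspect)


def placeholder_difference(a: Pose, b: Pose) -> float:
    """Hypothetical stand-in difference in [0, 1]; not the source formula."""
    dist = math.hypot(a[0] - b[0], a[1] - b[1])
    f_loc = dist / (1.0 + dist)
    roundness = min(1.0, max(a[4], b[4]))
    # Orientation matters less when either shape is nearly as wide as long.
    f_theta = abs(math.sin(a[2] - b[2])) * (1.0 - roundness)
    f_scale = 1.0 - math.exp(-abs(a[3] - b[3]))
    f_aspect = min(1.0, abs(a[4] - b[4]))
    return 1.0 - (1.0 - f_loc) * (1.0 - f_theta) * (1.0 - f_scale) * (1.0 - f_aspect)


def pose_match(gesture, object_poses, difference=placeholder_difference, radius=50.0):
    """Return (name of best object or None, confidence = 1 - difference)."""
    # Blocks 124-126: prescreen candidates by proximity (assumed radius test).
    candidates = {
        name: pose
        for name, pose in object_poses.items()
        if math.hypot(pose[0] - gesture[0], pose[1] - gesture[1]) <= radius
    }
    if not candidates:
        return None, 0.0
    # Blocks 128-130: compare poses and choose the most similar object.
    differences = {name: difference(gesture, pose) for name, pose in candidates.items()}
    best = min(differences, key=differences.get)
    return best, 1.0 - differences[best]
```

Passing the difference measure as a parameter keeps the selection flow independent of the particular formula used, so the measure set out in this description could be substituted for the placeholder without altering pose_match.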



PATH MATCHING SELECTION

To effect the path matching operation as shown in Fig. 9, this invention provides a path-matching selection
subfunction which collects a set of image objects that best accounts for the path of the selection-gesture-curve.
For this subfunction, first define a path to be a sequence of primitive curve segments (curve segments
containing no crossings, junctions, or sharp corners) approximately linked end-to-end. Next, define a
path-quality measure that estimates the similarity between a path and a selection-gesture-curve. In more detail,
the path-quality measure for any path is computed by a set of steps in a routine as follows:

1. Begin with the curve-segment deemed to be the beginning of the path sequence. Call this the "leftmost"
segment. Project the "leftmost" end of this segment onto the selection-gesture-curve, as shown in Fig. 10.

1a. Initialize the variable current-segment to be the leftmost segment.

1b. Initialize the variable last-projection-proportion to be the proportion of the selection-gesture-curve to
the left of the projection point.

1c. Initialize the variable best-possible-score, s_b, to be the proportion of the selection-gesture-curve to the
right of the projection point.

1d. Initialize the variable actual-score, s_a, to 0.

2. Compute a variable called fit-quality-factor, f_q, for the current-segment, as follows,

f_q = 1 - max(0, min(1, d_m / (p_4 l)))

where p_4 is a tunable parameter for which a value of .5 is satisfactory, d_m is the maximum distance
between a point on the current-segment and the selection-gesture-curve, and l is the length of the
current-segment.

3. Set the variable next-projection-proportion, p_n, to be the proportion of the selection-gesture-curve to
the left of the projection of the "right" end of the current-segment onto the selection-gesture-curve.

4. Set a variable called this-segment-selection-curve-array-fraction, f_r, to be the difference between the
next-projection-proportion and the last-projection-proportion.

5. Set a variable called this-segment-projection-length, to be the curve distance along the
selection-gesture-curve between the projections of the left and right ends of the current-segment.

6. Set a variable called excess-segment-length-cost, c_d, to be the value

where p_5 is a tunable parameter for which the value .25 is satisfactory.

7. Update the variable actual-score

8. Update the variable best-possible-score:

s_b = s_a + 1 - p_n

9. Set the leftmost-segment to be the next segment in the path sequence, and proceed to step 2.
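
By way of illustration only, the projection quantities used in steps 1b, 1c, 3 and 4 above might be computed as in the following sketch, which treats the selection-gesture-curve as a polyline and takes the "projection" of a point to be the nearest point on that polyline. The function name projection_proportion and the nearest-point interpretation are assumptions of this sketch; the cost and score updates of steps 6 and 7 are not reproduced here.

```python
import math


def projection_proportion(point, curve):
    """Proportion of `curve` (a polyline) lying to the left of the
    projection of `point`, measured by arc length from the first vertex."""
    px, py = point
    seg_lengths = [math.dist(a, b) for a, b in zip(curve, curve[1:])]
    total = sum(seg_lengths)
    if total == 0.0:
        raise ValueError("the gesture curve must have non-zero length")
    best_d2, best_arc, walked = math.inf, 0.0, 0.0
    for (ax, ay), (bx, by), seg_len in zip(curve, curve[1:], seg_lengths):
        if seg_len > 0.0:
            # Parameter of the nearest point on this piece, clamped to the piece.
            t = ((px - ax) * (bx - ax) + (py - ay) * (by - ay)) / (seg_len ** 2)
            t = max(0.0, min(1.0, t))
        else:
            t = 0.0
        qx, qy = ax + t * (bx - ax), ay + t * (by - ay)
        d2 = (px - qx) ** 2 + (py - qy) ** 2
        if d2 < best_d2:
            best_d2, best_arc = d2, walked + t * seg_len
        walked += seg_len
    return best_arc / total


# Steps 1b, 1c, 3 and 4 for one segment with ends (2, 1) and (6, 1):
gesture_curve = [(0.0, 0.0), (10.0, 0.0)]
last_projection_proportion = projection_proportion((2.0, 1.0), gesture_curve)
best_possible_score = 1.0 - last_projection_proportion   # share to the right
next_projection_proportion = projection_proportion((6.0, 1.0), gesture_curve)
curve_fraction = next_projection_proportion - last_projection_proportion
```

For this example the segment accounts for the part of the gesture curve between 20 % and 60 % of its length, so the fraction covered by the segment is 0.4.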

For the main body of the algorithm, maintain a data structure called the path-list consisting of a list of paths
along with their actual scores, s_a, and their best-possible-scores, s_b. The basic search algorithm proceeds as
follows.

1. Initialize the path-list with the set of curve-segments possessing an endpoint lying within a threshold
distance of the "leftmost" endpoint of the selection-gesture-path.

2. Initialize the variable current-actual-score, S_a, to be the maximum actual score s_a over the paths in
path-list.

3. Initialize the variable current-best-possible-score, S_b, to be the maximum best-possible-score s_b over the
paths in the path-list.

4. Set the variable best-possible-path to be the path in path-list having the greatest best-possible-score.

5. Perform an expansion operation on the best-possible-path. This consists of extending the path with all
curve-segments one of whose endpoints falls within a threshold distance of the rightmost end of the rightmost
segment in the path. The result of this operation is a list of new paths, one for each such extension
segment found. Call this result current-expanded-path-list.

6. Compare each path in path-list with each path in current-expanded-path-list, and remove any path
whose rightmost curve-segment is the same as the rightmost curve-segment of any other path which contains
fewer curve-segments.

7. Update the variable path-list to be the union of path-list and current-expanded-path-list.

8. Examine each path in path-list. Update the variable current-actual-score, S_a, to be the best actual-score
among these, and set the variable best-actual-path to be the corresponding path.

9. Eliminate from path-list any path whose best-possible-score s_b is less than the current-actual-score S_a.

10. Eliminate from path-list any path which has already been expanded and for which its actual-score s_a
falls below the current-actual-score S_a.

11. If only one path remains, output it as the best fitting path and output its actual score s_a as the confidence
value. If more than one path remains, proceed to step 5.
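
By way of illustration only, the expansion operation of step 5 and the elimination of step 9 might be sketched as follows. The representation of a curve-segment as an oriented pair of endpoints, the names expand_path and prune_by_bound, and the rule that a segment already in the path is not added again are assumptions of this sketch; the scores are taken as given, since their computation belongs to the path-quality routine.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]   # oriented: (left end, right end)
Path = Tuple[Segment, ...]


def expand_path(path: Path, segments: Sequence[Segment], threshold: float) -> List[Path]:
    """Step 5: one new path per segment having an endpoint within
    `threshold` of the rightmost end of the rightmost segment of `path`."""
    tail = path[-1][1]
    used = {frozenset(segment) for segment in path}
    expanded = []
    for a, b in segments:
        if frozenset((a, b)) in used:
            continue  # assumed: a segment is not reused within one path
        if math.dist(a, tail) <= threshold:
            expanded.append(path + ((a, b),))
        elif math.dist(b, tail) <= threshold:
            expanded.append(path + ((b, a),))  # flip so the near end is first
    return expanded


def prune_by_bound(scored_paths, current_actual_score):
    """Step 9: keep (path, s_a, s_b) entries whose best-possible-score s_b
    is not less than the current actual score."""
    return [entry for entry in scored_paths if entry[2] >= current_actual_score]
```

Eliminating paths whose best-possible-score cannot reach the best actual score found so far is what allows the search to stop without enumerating every chain of segments.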

Certain changes and modifications of the embodiment of the invention herein disclosed will be readily ap- 
parent to those of average skill in the art. Moreover, uses of the invention other than for coordinate determi- 
nation in a digitizer system will also be readily apparent to those of skill in the art. 



Claims 

1. A display editing system for selecting at least one image object from a plurality of image objects, which
may be overlapping, for editing, comprising

first means for entering a plurality of image objects on a display surface,

second means responsive to each entered object for creating and storing a first characteristic
descriptor representative of said object,

third means for entering a selection gesture for selecting at least one of said objects,

fourth means responsive to said selection gesture for creating and storing a second characteristic
descriptor representative of said selection gesture, and

fifth means for choosing at least one of said first characteristic descriptors which most closely
corresponds to said second characteristic descriptor, thereby selecting at least one image object.






2. The system of claim 1, wherein said second means includes means with respect to a fixed location for
determining the X-Y location of said object, for determining the orientation of said object, for determining 
a function of the length of said object, and for determining the aspect ratio of said object, and means for 
generating, from said foregoing determinations, said first characteristic descriptor. 

3. The system of claim 1 or claim 2 wherein said fourth means includes means with respect to a fixed location 
for determining the X-Y location of said gesture, for determining the orientation of said gesture, for de- 
termining the function of the length of said gesture, and for determining the aspect ratio of said gesture, 
and means for generating, from said foregoing determinations, said second characteristic descriptor. 

4. The system of any one of claims 1 to 3 wherein said fifth means includes a means for selecting a set of 
objects corresponding to said gesture, means for calculating a difference factor between each said object 
and said gesture, and means for selecting that one of said first characteristic descriptors which represents 
the least difference factor between one of said objects and said gesture. 

5. A method of selecting a graphical object in a display system, which object is most closely related to a
selection gesture performed in relation to the display, comprising the steps of

inputting a selection gesture, creating a characteristic descriptor for said input gesture, selecting 
from a collection of stored objects the object or objects most closely proximate to said input gesture, re- 
trieving characteristic descriptors representative of said selected objects, and comparing the two sets of 
descriptors for choosing the object corresponding most closely to the selection gesture. 

6. The method of claim 5, including creating said characteristic descriptor of an object by determining, with
respect to a fixed location, the X-Y location of said object, determining the orientation of said object,
determining a function of the length of said object, determining the aspect ratio of said object, and generating,
from said foregoing determinations, said characteristic descriptor of the object.

7. The method of claim 5 or claim 6 wherein creating said characteristic descriptor of the gesture includes, 
with respect to a fixed location, determining the X-Y location of said gesture, determining the orientation 
of said gesture, determining the function of the length of said gesture, determining the aspect ratio of said 
gesture, and generating, from said foregoing determinations, said characteristic descriptor of the gesture. 


8. A display editing system for selecting a path formed of a collection of primitive objects suitably related to
each other of an image, from a plurality of paths formed of line segments, comprising:

first means for entering said plurality of primitive objects on a display surface,

second means for entering a selection gesture curve,

third means for generating a path list corresponding to a plurality of incomplete paths representing
a sequence of primitive objects;

fourth means responsive to entry of said selection gesture curve for comparing said entered
selection gesture curve with partial curves of said path list, and

fifth means for choosing at least one of said paths which most closely corresponds to said entered
selection gesture curve.

9. The system of claim 8 wherein said primitive objects comprise a sequence of primitive line segments
approximately linked together, said primitive line segments each having no crossing, junctions or sharp
corners.

10. The system of claim 9 wherein said fifth means includes means for adding a chosen path to a list of paths
until the final line of paths represents the best of all chosen paths.
